N=1:

rank | frequency | n-gram |
---|---|---|
1 | 20509 | -а |
2 | 18852 | -н |
3 | 5290 | -ш |
4 | 4327 | -ь |
5 | 4258 | -о |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 9942 | -ан |
2 | 4431 | -ра |
3 | 4285 | -на |
4 | 3616 | -хь |
5 | 3178 | -йн |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 2439 | -ийн |
2 | 2086 | -ехь |
3 | 1895 | -кан |
4 | 1803 | -ран |
5 | 1274 | -нан |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1200 | -скан |
2 | 856 | -аран |
3 | 646 | -ашна |
4 | 626 | -кахь |
5 | 454 | -ашца |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 608 | -шкахь |
2 | 481 | -вскан |
3 | 343 | -нскан |
4 | 240 | --Сити |
5 | 233 | -аллин |

The tables show the most frequent letter n-grams at the ends of words for N = 1 to 5, one table per value of N. The procedure parallels section 2.2.5, Most frequent word beginnings, except that the n-grams are taken from the end of each word, so the aim is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
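
For readers without access to the database, the same counting can be sketched in Python on an in-memory word list. This is only an illustration under stated assumptions: the function name `top_word_endings` and the sample tokens are hypothetical (the tokens are invented strings, not corpus data), the `w_id>100` filter of the query is not modelled, and ties are broken alphabetically, which the SQL query does not guarantee.

```python
from collections import Counter


def top_word_endings(words, n, limit=5):
    """Return (rank, frequency, ending) rows for the most frequent n-letter endings.

    Hypothetical helper mirroring GROUP BY RIGHT(word, n) ... ORDER BY cnt DESC.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    counts = Counter()
    for word in words:
        # Like RIGHT(word, n): a word shorter than n contributes the whole word.
        counts["-" + word[-n:]] += 1
    # Sort by descending frequency; alphabetical tie-break is an assumption.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, ending)
        for rank, (ending, freq) in enumerate(ranked[:limit], start=1)
    ]


# Invented tokens for illustration only; they are not taken from the corpus.
sample = ["абийн", "вгийн", "декан", "жзкан", "лмехь"]

for row in top_word_endings(sample, 3):
    print(row)
```

Calling the function with `n` set to 1 through 5 would produce one table per ending length, in the same rank, frequency, n-gram layout as above.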